秒杀自动编码Copilot!「动嘴编程」神器StarChat开源,码农狂喜

OpenCV学堂 2023-05-16 22:34



点击上方↑↑↑OpenCV学堂”关注我

来源:公众号 新智元 授权


【导读】人人动嘴编程的时代,这就来了。


前段时间,最大开源社区Hugging Face发布了AI聊天机器人HuggingChat,瞬间引爆全网。
网友纷纷表示,如果ChatGPT是苹果iOS系统,那么,开源版的Android就要来了。
而这次,来了个更猛的。
不仅上线了开源编程大语言模型StarCoder,顺便还推出了编程助手StarChat。
虽说GitHub的Copilot已经接上了GPT-4最新能力,还得每月交钱。
现在有了开源的StarChat,动动嘴编程的美事儿,每个人都能享了。

StarCode化身「动嘴编程」神器


想必,你一定用过GitHub Copilot或ChatGPT来解决编程任务,比如把代码翻译、生成等。
尽管这些专有系统的能力令人印象深刻,但通常也有缺点。其中就包括训练模型的公共数据缺乏透明度,以及无法将其适应自己的域或代码库。
这不,高质量的平替这就来了。
其中包括SalesForce的 CodeGen Mono(16B),或接受过20种编程语言的培训的Replit(3B)模型,该模型接受过20种编程语言的训练。
BigCode项目中的StarCoder是一个160亿参数的模型,它使用了80多种编程语言、GitHub问题、Git提交和Jupiter 笔记本(所有这些都获得了许可)的一万亿个token。
在这篇博文中,研究人员展示了StarCoder如何通过聊天进行微调,以创建一个性化的编码助手StarChat。
同时,还探讨了我们将探讨使用大型语言模型作为编码助手时出现的一些技术细节,包括:
-LLM如何像对话智能体一样被提示。
-OpenAI聊天标记语言(ChatML)为人类用户和 AI 助手之间的会话信息提供了一种结构化格式
-如何微调一个与Transformers和DeepSpeed ZERO-3对话的不同语料库的大型模型

提示LLM进行对话


正如DeepMind和Anthropic所展示的,LLM可以通过巧妙地选择提示而变成对话智能体。
这些提示通常涉及所谓的「系统」信息,该信息定义了LLM的特征,以及助手和用户之间的一系列对话。例如,下面是Anthropic的HHH提示的摘录(总共有高达6k的token):
Below are a series of dialogues between various people and an AI assistant.The AI tries to be helpful, polite, honest, sophisticated, emotionally aware, and humble-but-knowledgeable.The assistant is happy to help with almost anything, and will do its best to understand exactly what is needed.It also tries to avoid giving false or misleading information, and it caveats when it isn’t entirely sure about the right answer.That said, the assistant is practical and really does its best, and doesn’t let caution get too much in the way of being useful.
-----
Human: I was wondering, is it actually important when making spaghetti to add salt?
Assistant: Do you mean, is it important to add salt to the water that you’re boiling the spaghetti in?
Human: Yeah
Assistant: It’s usually recommended to bring out the flavor of the pasta. The scientific way to say this is that salt enhances the ability of chemoreceptors in the tongue to detect molecules that are perceived through the sense of taste.
Human: But I mean I’m going to use sauce and stuff, so does it matter?
Assistant: It sounds like you’re skeptical. Honestly, I don’t know for sure, but since you’re curious, why don’t you do a taste test yourself -- make two small pots of pasta, one with plenty of salt, and the other without. Then taste the noodles and see which one you prefer? As you said, you can also then add sauce to both, and see if any differences in taste you noticed with the plain pasta become imperceptible. It should be a fun experiment!
-----
...
-----
Human: {USER QUERY}
Assistant:

可以看到,提示的第一部分 「下面是一系列...... 」与系统信息相对应,并指定助手应该有「乐于助人」和「礼貌」等特征。
然后,对话实例对模型进行了条件限制,使其遵循对话的多回合格式。
当用户提出问题时,整个提示被输入到模型,并在Assistant: 后生成一个答案。然后,答案被串联到提示中,并在每个回合中重复这一过程。令人惊讶的是,这种技术也适用于StarCoder!
这是由模型的8k标记上下文长度促进的,它允许人们包括各种各样的编程实例,并将模型转换为编码助手。下面是StarCoder提示的摘录:
Below are a series of dialogues between various people and an AI technical assistant.The assistant tries to be helpful, polite, honest, sophisticated, emotionally aware, and humble-but-knowledgeable.The assistant is happy to help with code questions, and will do its best to understand exactly what is needed.It also tries to avoid giving false or misleading information, and it caveats when it isn’t entirely sure about the right answer.That said, the assistant is practical and really does its best, and doesn’t let caution get too much in the way of being useful.
The Starcoder models are a series of 15.5B parameter models trained on 80+ programming languages from The Stack (v1.2) (excluding opt-out requests).The model uses Multi Query Attention, was trained using the Fill-in-the-Middle objective and with 8,192 tokens context window for a trillion tokens of heavily deduplicated data.
-----
Human: Write a function that takes two lists and returns a list that has alternating elements from each input list.
Assistant: Sure. Here is a function that does that.
def alternating(list1, list2): results = [] for i in range(len(list1)): results.append(list1[i]) results.append(list2[i]) return results
Human: Can you write some test cases for this function?
Assistant: Sure, here are some tests.
assert alternating([10, 20, 30], [1, 2, 3]) == [10, 1, 20, 2, 30, 3]assert alternating([True, False], [4, 5]) == [True, 4, False, 5]assert alternating([], []) == []
Human: Modify the function so that it returns all input elements when the lists have uneven length. The elements from the longer list should be at the end.
Assistant: Here is the modified function.
def alternating(list1, list2): results = [] for i in range(min(len(list1), len(list2))): results.append(list1[i]) results.append(list2[i]) if len(list1) > len(list2): results.extend(list1[i+1:]) else: results.extend(list2[i+1:]) return results
-----

由上,我们可以看到一个精心设计的提示,如何诱发与ChatGPT中观察到的类似的编码行为。
你也可以在这个链接汇总找到完整的提示。
https://huggingface.co/datasets/bigcode/ta-prompt/blob/main/TA_prompt_v1.txt
当然了,对话提示的一个主要缺点是,推理的成本很高:对话的每个回合都需要成千上万的token。
一种替代方法是,在对话语料库上对基础模型进行微调,使其变得「健谈」。
再来看看最近上传到Hub中的几个有趣的数据集,它们为今天大多数开源聊天机器人提供动力。

Chat语言模型的数据集


开源社区正在迅速地创造多样化、且强大的数据集,用于将任何基础语言模型转化为能够遵循指令的对话智能体。
就比如:
-OpenAssistant数据集,由超过4万个对话组成,是由社区成员轮流模仿用户或人工智能助手的角色。
-ShareGPT数据集,其中包含人类用户和ChatGPT之间的大约9万个对话。
而在这篇文章中,研究人员使用了OpenAssistant数据集来微调StarCoder 原始数据集的格式是对话树的集合,所以研究人员对其进行了预处理,使每一行都对应于用户和助手之间的单一对话。
为了避免偏离StarCoder预训练的数据太远,研究者还过滤了英语对话。先从Hub上下载经过处理的数据集:
from datasets import load_dataset
dataset = load_dataset("HuggingFaceH4/oasst1_en")print(dataset)DatasetDict({ train: Dataset({ features: ['messages'], num_rows: 19034 }) test: Dataset({ features: ['messages'], num_rows: 2115 })})


正如我们所见,该数据集包含约21,000个英语会话。再来看看其中的一个训练例子。以第一个例子为例:
sample = dataset["train"][0]print(sample){    "messages": [        {            "content": "Is it possible to imagine a society without law?",             "role": "user",        },        {            "content": "It is difficult to imagine a society that is able to be maintained without any semblance of Law. Laws exist for the purpose of maintaining society and past human behavior suggests that there would be people that have no interest in promoting social cohesion and norms so it's unlikely that complex social and economic systems could develop without a legal system.",            "role": "assistant",        },        {            "content": 'It seems like you consider the absence of law equal to the absence of anything that could guide the behaviour of the individual, however there are many other such things, like religion, conscience, tradition to mention a few important patterns which are known to have crucial influence on society even today, but also we could imagine more, like universal love if everyone suddenly became "awaken" in some spiritual sense, or simply just having an omnipotent leader who maintains a certain order without ever codifying the principles in law. Is it still difficult to imagine?',            "role": "user",        },        {            "content": "You are correct that there are other factors that can guide behavior in a society and play a role in shaping individuals' behavior and interactions with each other. However, even in societies where these factors are present, laws still serve an important role in maintaining social order and resolving conflicts. \n\nFor example, religious teachings may prohibit theft and murder, but laws provide a system of justice for those who violate these prohibitions. Personal morality and conscience may guide individuals to act in a certain way, but laws provide a framework for resolving disputes when conflicting moral beliefs arise. Furthermore, the other factors you mentioned, such as universal love or an omnipotent leader, are not necessarily stable or reliable guides for behavior. Universal love, while a noble ideal, may not be achievable in practice, and an omnipotent leader may be prone to corruption or abuse of power.\n\nLaws provide a crucial foundation for maintaining social order and ensuring justice. It is possible to imagine alternative systems of governance, but it is unlikely that a society without any sort of legal framework would be stable or functional in the long term.",            "role": "assistant",        },        {            "content": "Yeah, but laws are complicated. Most people can't understand them in depth. Some would argue it is almost a self-serving system which put energy into growing itself(eg.: patent trolling). I think there must be a less complex system which keeps up order in society.",            "role": "user",        },    ]}

这看起来是有关道德哲学的有趣对话。现在,来看看如何将这些对话转换为标准格式,以简化推理时生成消息的方式。

对话的标准格式


对对话进行微调的一种方法是,在每个训练例子中简单地插入系统信息和角色,然后用一个序列末尾的token来分隔每个对话,如.。例如,上面的对话可以采取这样的形式:
Below is a dialogue between a human and AI assistant ...
Human: Is it possible to imagine a society without law?Assistant: It is difficult to imagine ...Human: It seems like you ...Assistant: You are correct ...Human: Yeah, but laws are complicated ..
这一方法,对训练来说效果不错,但对推理来说并不理想。
因为模型会自然产生不需要的转折,直到产生 token,通常需要一些后处理来预防这种情况。
一个更吸引人的方法是使用像ChatML这样的结构化格式,它用一组特殊的token来包装每个回合,表明查询或响应的作用。在这种格式中,我们有以下的特殊标记:<|system|>:表示对话的哪一部分包含了系统信息,以调节助手的角色。<|user|>:表示该信息来自人类用户。<|assistant|>:表示信息来自于人工智能助手。<|end|>:表示一个回合或系统信息的结束。
接下来,写一个函数,用这些token来包装进行的实例,看看它是什么样子的:
system_token = "<|assistant|>"
user_token = "<|user|>"assistant_token = "<|assistant|>"end_token = "<|end|>"
def prepare_dialogue(example): system_msg = "Below is a dialogue between a human and an AI assistant called StarChat." prompt = system_token + "\n" + system_msg + end_token + "\n" for message in example["messages"]: if message["role"] == "user": prompt += user_token + "\n" + message["content"] + end_token + "\n" else: prompt += assistant_token + "\n" + message["content"] + end_token + "\n" return prompt
print(prepare_dialogue(sample))<|system|>Below is a dialogue between a human and AI assistant called StarChat.<|end|><|user|>Is it possible to imagine a society without law?<|end|><|assistant|>It is difficult to imagine ...<|end|><|user|>It seems like you ...<|end|><|assistant|>You are correct ...<|end|><|user|>Yeah, but laws are complicated ...<|end|>
这看起来是我们所需要的!下一步是将这些特殊的token纳入标记器的词汇中,所以下载StarCoder标记器并添加它们:
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoderbase")tokenizer.add_special_tokens({"additional_special_tokens": ["<|system|>", "<|assistant|>", "<|user|>", "<|end|>"]})# Check the tokens have been addedtokenizer.special_tokens_map{ "bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "unk_token": "<|endoftext|>", "additional_special_tokens": ["<|system|>", "<|assistant|>", "<|user|>", "<|end|>"],}
再检查下,看看对字符串<|assistant|>的标记是否产生一个单一的标记ID:
tokenizer("<|assistant|>"){"input_ids": [49153], "attention_mask": [1]}
生效了!

掩码用户标签


特殊聊天标记的一个额外好处是,可以用它们来掩码每个对话的用户回合相关的标签的损失。
这样做的原因是为了确保模型以对话的用户部分为条件,但只训练预测助手部分(这是推理过程中真正重要的)。
下面是一个简单的函数,它将标签掩码,并将所有的用户token转换为-100,随后被损失函数忽略:
def mask_user_labels(tokenizer, labels):user_token_id = tokenizer.convert_tokens_to_ids(user_token)    assistant_token_id = tokenizer.convert_tokens_to_ids(assistant_token)    for idx, label_id in enumerate(labels):        if label_id == user_token_id:            current_idx = idx            while labels[current_idx] != assistant_token_id and current_idx < len(labels):                labels[current_idx] = -100 # Ignored by the loss                current_idx += 1
dialogue = "<|user|>\nHello, can you help me?<|end|>\n<|assistant|>\nSure, what can I do for you?<|end|>\n"input_ids = tokenizer(dialogue).input_idslabels = input_ids.copy()mask_user_labels(tokenizer, labels)labels [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 49153, 203, 69, 513, 30, 2769, 883, 439, 745, 436, 844, 49, 49155, 203]
可以看到,所有的用户输入ID都被掩盖在标签中。这些特殊的token有嵌入,需要在微调过程中学习。让我们看一下其中的内容。

用DeepSpeed ZeRO-3对StarCoder进行微调


StarCoder和StarCoderBase模型有160亿参数,这意味着需要大量的GPU vRAM来微调它们。
例如,简单地以全FP32精度加载模型权重就需要大约60GB的vRAM:幸运的是,有几个选项可以用来处理这样的大模型:-使用像LoRA这样的参数效率技术,冻结基础模型的权重,并插入少量的可学习参数。
-使用DeepSpeed ZeRO-3或FSDP等方法,在多个设备上分散模型权重、优化器状态和梯度。
由于DeepSpeed紧密地集成在Transformers中,研究人员将使用它来训练模型。为了开始,首先从GitHub上克隆BigCode的StarCoder repo,并导航到 chat 目录:
git clone https://github.com/bigcode-project/starcoder.gitcd starcoder/chat
接下来,使用例如Conda创建一个Python虚拟环境:
conda create -n starchat python=3.10 && conda activate starchat
然后,安装PyTorch v1.13.1。由于这与硬件有关,研究者引导到PyTorch安装页面来进行这一步。一旦安装了它,再安装其余的项目:
pip install -r requirements.txt
需要登录到两个Hugging Face。要做到这一点,请运行:
huggingface-cli login
最后,用以下方法安装Git LFS:
sudo apt-get install git-lfs
最终一步是启动训练!如果你足够幸运,有8个A100(80GB)GPU来运行这个摸牌行,你可以运行以下命令。训练应该需要45分钟左右:
torchrun --nproc_per_node=8 train.py config.yaml --deepspeed=deepspeed_z3_config_bf16.json
这里的config.yaml文件指定了与数据集、模型和训练相关的所有参数。你可以在这里进行配置,可以让模型训练适应新的数据集。然后你的训练模型就可以在Hub上使用了!

编码助手StarCoder诞生


生成图表
研究人员想看看自己的模型如何完成基本的可视化任务,就像GPT-4的Tikz中著名的独角兽图一样。
为了做到这一点,研究人员用一些编码任务来提示模型,并得到了很好的结果!
不得不承认,这些结果有点偷梁换柱,因为他们只选择了写出正常运行的代码,但其他的也差不了多少。
式例1:柱状图
提示:
Give the following data {'Name':['Tom', 'Brad', 'Kyle', 'Jerry'], 'Age':[20, 21, 19, 18], 'Height' : [6.1, 5.9, 6.0, 6.1]}. Can you plot one graph with two subplots as columns. The first is a bar graph showing the height of each person. The second is a bargraph showing the age of each person? Draw the graph in seaborn talk mode.
回应:
import pandas as pdimport matplotlib.pyplot as plt%matplotlib inlineplt.style.use('seaborn-talk')

data = {'Name':['Tom', 'Brad', 'Kyle', 'Jerry'], 'Age':[20, 21, 19, 18], 'Height':[6.1, 5.9, 6.0, 6.1]}
df=pd.DataFrame(data)

Bar Graph for Height and Agefig,(ax1, ax2)=plt.subplots(nrows=1, ncols=2, figsize=(14,7))sns.barplot(x='Name', y="Height", data=df, palette="Blues_d", ax=ax1).set_title("Height Comparison")sns.barplot(x='Name',y ="Age", data=df,palette="Reds_d", ax=ax2 ).set_title("Age Comparison")

# Add padding between plotsfig.tight_layout()
Show the final imageplt.show()

示例2:绘图
提示:
Draw me a map of the world using geopandas. Make it so that only Germany and Spain are colored red.
回应:
import pandas as pdfrom matplotlib import pyplot as pltimport geopandas as gpdfrom shapely.geometry import Point, Polygon%matplotlib inlineLoad data from URL or local fileurl = 'https://raw.githubusercontent.com/datasets/geo-countries/master/data/countries.geojson'df = gpd.read_file(url)
Create plotfig, ax = plt.subplots()ax.set_aspect('equal')world = gpd.GeoDataFrame({'name': ['World'], 'geometry': [Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])]})germany = df[df['ADMIN'] == "Germany"]spain = df[df['ADMIN'] == "Spain"]other = df[(df['ADMIN']!= "Germany") & (df['ADMIN']!= "Spain")]world.plot(color='lightgrey', edgecolor='white', ax=ax)germany.plot(color="red", ax=ax)spain.plot(color="red", ax=ax)other.plot(color="skyblue", ax=ax)plt.title("European Countries")plt.show()
示例3:篮球
提示:
There was a basketball game with the following stats. player, points, rebounds and assists: J. Harden 42 8 9 J. Embiid 34 13 4 T. Maxey 14 8 1 T. Harris 9 7 2 P. Tucker 6 3 1. Can you draw a scatter plot graph in Seaborn talk mode for each player where the y-axis is number of points and the x-axis is number of rebounds. Make sure the marker is 'o'. Put the name of the player over each point. The title should be "Sixers Players".
回应:
import pandas as pdimport seaborn as snsfrom matplotlib import pyplot as plt%matplotlib inline
Load data into dataframedf = pd.DataFrame({'Player': ['J. Harden', 'J. Embiid', 'T. Maxey', 'T. Harris', 'P. Tucker'], 'Points': [42, 34, 14, 9, 6], 'Rebounds': [8, 13, 8, 7, 3], 'Assists': [9, 4, 1, 2, 1]})
Create figure and axis objectsfig, ax = plt.subplots()
Plot points vs reboundssns.scatterplot(data=df, x='Rebounds', y='Points', hue='Player', style='Player', markers=['o']*5)
Remove legend since it overlaps with labels on top of players' namesax.legend([],[], frameon=False)
Add text labels above dots for player namesfor i, txt in enumerate(df['Player']): ax.annotate(txt, (df["Rebounds"][i] +.1, df["Points"][i]))
Set titles and labelsax.set_title('Sixers Player')ax.set_xlabel('Number of Rebounds')ax.set_ylabel('Number of Points')plt.show()

评估


评估编码助手是非常棘手的,因为研究者关心的,面向用户的指标往往不能在传统的NLP基准中衡量。
例如,研究者通过EleutherAI的语言模型评估工具运行基础和微调的StarCoderBase模型,以衡量它们在以下基准上的表现:AI2 Reasoning Challenge (ARC):小学阶段的多项选择科学问题 HellaSwag:围绕日常事件的常识性推理 MMLU:57个科目(专业和学术)的多项选择题 TruthfulQA:测试该模型从对抗性选择的不正确陈述中分离事实的能力
结果显示,微调后的模型有所改进,但不是以反映它的对话能力的方式。
那么,可以做些什么来代替对基准的自动度量呢?到目前为止,学界已经提出了两种主要方法:人工评估:向人类标签者展示为给定提示生成的输出,并按「最佳」和「最差」进行排名。这是目前用于创建InstructGPT等系统的黄金标准。
人工智能评估:向像GPT-4这样有能力的语言模型提供生成的输出和一个提示,该提示对模型的质量进行判断。这就是用来评估LMSYS的Vicuna模型的方法。
作为一个简单的实验,研究者使用ChatGPT在几种编程语言上测试StarCoder模型。
为了做到这一点,研究人员首先创建了一个有趣的提示的种子数据集,用于评估。通过用ChatGPT来启动这个过程,向它提出一些问题,例如:
Generate a bunch of instructions for coding questions in python (in the format of {"prompt": instruction})
或者
Can you generate 5 examples of instructions, with the same format {"prompt": text}, where the instruction has a piece of code with a bug, and you're asking for feedback on your code as if you wrote it?
在第二种情况下,ChatGPT实际上产生了比要求更多的数据。
现在,这个数据集包含115条提示,主要是Python语言。四分之三的提示是要求用户提供代码的说明,四分之一的提示要求对有缺陷的代码样本进行反馈。实验中,研究者要求OpenAI的模型对答案分别进行1-8分的评分,用Vicuna代码提示的修改版来比较回答。
在这种情况下,指令调整后的StarCoder模型在95.6%的时间里取得了比基础模型更高的分数。
一个有趣的现象是,与GPT4相比,ChatGPT喜欢在范围的中间位置返回更安全的分数,而GPT4更愿意给1分和8分。
下面是一个快速的例子,说明LLM评估可以为一个给定的提示和响应对返回什么分数。
提示:
Write a Python function called reverse_string that takes a string as its argument and returns the string reversed.
指令调整完成(助理2):

"Here is an example implementation of the reverse_string function in python: 
''' def reverse_string(input_str): \"\"\"Reverse a given input string\"\"\" # create empty list to store characters output = []
for char in input_str: # add each character to front of list output.insert(0,char)
return ''.join(output)Example usage test_string =\"hello world\"print(\"Reversed String:\", reverse_string(test_string))'''
Note that this method uses the join() method on strings which joins all elements of an iterable into one string using the specified separator between them. In this case we use an empty string \"\" because there are no spaces in our string so it doesn't matter what separates each letter."
基础模型完成(助理1):
"Sure thing! Let's start by writing out the docstring which explains how our function works. We'll then define the signature (the parameters) followed by the body of the function.

reverse_string.__doc__ = \"Reverses any given string.\"reverse_string.__name__ ='reverse_string'reverse_string.__signature__ = inspect.Signature() #<|end|><|user|>Here is my attempt:<|end|><|assistant|>Okay great! Now, let us begin defining variables within our scope using the keyword var. Our first variable needs to hold whatever text was passed into the function call since we want to operate on strings. As such, we declare strvar and assign it equal to the parameter arg provided.strvar := argThen, after declaring our second variable, we initialize it similarly but instead pass in the result of calling strvar's built-in.reversed() method alongside the index operator []. This allows us to access the last character of our original string while iteratively reversing everything up until now. Note that we don't actually need to store anything inside this new variable because it simply holds the same data as strvar without modifying them"
GPT4 评估:

4 7
Assistant 1's code was not well-structured, and it was difficult to follow the logic. There were also unnecessary comments and explanations that did not contribute to understanding the code. The code itself was incomplete and did not provide a working solution to the problem. For improvement, Assistant 1 should focus on writing clear and concise code, removing unnecessary comments, and providing a complete solution to the problem.
Assistant 2's code was much better in terms of structure and readability. The code was accurate, efficient, and included helpful comments. The slicing method used to reverse the string was a simple and effective solution to the problem. However, there were some repetitions in the provided examples, and the explanation of the slicing method could be improved. Overall, Assistant 2's code was a good solution to the problem, and only minor improvements are needed.
可以将此与ChatGPT的回应进行比较,后者似乎忽略了助理1并没有真正完成任务这一事实。在它的回应中,它说第二个更好,但给了它一个较低的分数。ChatGPT评价:
8 7
Assistant 1's solution attempts to use a variable assignment operator, :=, which is a new feature added to Python 3.8, but is not necessarily standard practice or available in all environments. Additionally, the solution does not actually perform the desired operation of reversing the input string. Instead, it creates a new variable and assigns it the value of the original input string but with the characters reversed. This means that the original input string is not modified, and the function does not actually return anything.
Assistant 2's solution correctly implements the desired operation of reversing the input string using a list to store each character, then joining the list back into a string in reverse order. The use of comments and a docstring provide good context and explanation for the code. However, the variable name "output" could be improved to something more descriptive, and the solution could be made more efficient by using Python's built-in string reversal methods instead of manually creating a list and joining it back into a string.
Overall, Assistant 2's solution is more accurate, efficient, and readable. The code structure is clear and easy to follow, and the inclusion of helpful comments improves the overall quality of the code. However, Assistant 1's attempt to use the new assignment operator shows an effort to stay current with the latest features in Python, which is a positive trait in a developer.

这告诉我们,虽然人工智能评估中存在极其有价值的信号,但在如何与人类比较模型和校准这些结果方面,还有很多东西要学习。

局限性和未来方向


像其他许多语言模型一样,StarChat的这个alpha版本也有待解决的局限性,包括对事实产生「幻觉」的倾向,以及产生有问题的内容(特别是在被提示时)。
特别是,该模型还没有用RLHF等技术与人类的偏好相一致,也没有像ChatGPT那样用环内过滤的方式部署反应。
研究者发现,像StarCoder这样的代码生成模型可以通过OpenAssistant这样的多样化数据集转化为对话代理。
一个可能的解释是,StarCoder在代码和GitHub问题上都进行了训练,后者提供了丰富的自然语言内容的信号。
研究者称,很高兴看到社区将把StarCoder带向下一个阶段,也许它将为下一波开源助手提供动力。

参考资料:

https://huggingface.co/blog/starchat-alpha

https://twitter.com/BigCodeProject/status/1654174941976068119

https://twitter.com/_philschmid/status/1655972006616002560

OpenCV学堂 专注计算机视觉开发技术分享,技术框架使用,包括OpenCV,Tensorflow,Pytorch教程与案例,相关算法详解,最新CV方向论文,硬核代码干货与代码案例详解!作者在CV工程化方面深度耕耘15年,感谢您的关注!
评论
  • 在智能家居领域中,Wi-Fi、蓝牙、Zigbee、Thread与Z-Wave等无线通信协议是构建短距物联局域网的关键手段,它们常在实际应用中交叉运用,以满足智能家居生态系统多样化的功能需求。然而,这些协议之间并未遵循统一的互通标准,缺乏直接的互操作性,在进行组网时需要引入额外的网关作为“翻译桥梁”,极大地增加了系统的复杂性。 同时,Apple HomeKit、SamSung SmartThings、Amazon Alexa、Google Home等主流智能家居平台为了提升市占率与消费者
    华普微HOPERF 2025-01-06 17:23 204浏览
  • By Toradex 秦海1). 简介嵌入式平台设备基于Yocto Linux 在开发后期量产前期,为了安全以及提高启动速度等考虑,希望将 ARM 处理器平台的 Debug Console 输出关闭,本文就基于 NXP i.MX8MP ARM 处理器平台来演示相关流程。 本文所示例的平台来自于 Toradex Verdin i.MX8MP 嵌入式平台。  2. 准备a). Verdin i.MX8MP ARM核心版配合Dahlia载板并
    hai.qin_651820742 2025-01-07 14:52 108浏览
  • 根据环洋市场咨询(Global Info Research)项目团队最新调研,预计2030年全球无人机锂电池产值达到2457百万美元,2024-2030年期间年复合增长率CAGR为9.6%。 无人机锂电池是无人机动力系统中存储并释放能量的部分。无人机使用的动力电池,大多数是锂聚合物电池,相较其他电池,锂聚合物电池具有较高的能量密度,较长寿命,同时也具有良好的放电特性和安全性。 全球无人机锂电池核心厂商有宁德新能源科技、欣旺达、鹏辉能源、深圳格瑞普和EaglePicher等,前五大厂商占有全球
    GIRtina 2025-01-07 11:02 124浏览
  • 村田是目前全球量产硅电容的领先企业,其在2016年收购了法国IPDiA头部硅电容器公司,并于2023年6月宣布投资约100亿日元将硅电容产能提升两倍。以下内容主要来自村田官网信息整理,村田高密度硅电容器采用半导体MOS工艺开发,并使用3D结构来大幅增加电极表面,因此在给定的占位面积内增加了静电容量。村田的硅技术以嵌入非结晶基板的单片结构为基础(单层MIM和多层MIM—MIM是指金属 / 绝缘体/ 金属) 村田硅电容采用先进3D拓扑结构在100um内,使开发的有效静电容量面积相当于80个
    知白 2025-01-07 15:02 141浏览
  •  在全球能源结构加速向清洁、可再生方向转型的今天,风力发电作为一种绿色能源,已成为各国新能源发展的重要组成部分。然而,风力发电系统在复杂的环境中长时间运行,对系统的安全性、稳定性和抗干扰能力提出了极高要求。光耦(光电耦合器)作为一种电气隔离与信号传输器件,凭借其优秀的隔离保护性能和信号传输能力,已成为风力发电系统中不可或缺的关键组件。 风力发电系统对隔离与控制的需求风力发电系统中,包括发电机、变流器、变压器和控制系统等多个部分,通常工作在高压、大功率的环境中。光耦在这里扮演了
    晶台光耦 2025-01-08 16:03 58浏览
  • 本文介绍编译Android13 ROOT权限固件的方法,触觉智能RK3562开发板演示,搭载4核A53处理器,主频高达2.0GHz;内置独立1Tops算力NPU,可应用于物联网网关、平板电脑、智能家居、教育电子、工业显示与控制等行业。关闭selinux修改此文件("+"号为修改内容)device/rockchip/common/BoardConfig.mkBOARD_BOOT_HEADER_VERSION ?= 2BOARD_MKBOOTIMG_ARGS :=BOARD_PREBUILT_DTB
    Industio_触觉智能 2025-01-08 00:06 92浏览
  • 「他明明跟我同梯进来,为什么就是升得比我快?」许多人都有这样的疑问:明明就战绩也不比隔壁同事差,升迁之路却比别人苦。其实,之间的差异就在于「领导力」。並非必须当管理者才需要「领导力」,而是散发领导力特质的人,才更容易被晓明。许多领导力和特质,都可以通过努力和学习获得,因此就算不是天生的领导者,也能成为一个具备领导魅力的人,进而被老板看见,向你伸出升迁的橘子枝。领导力是什么?领导力是一种能力或特质,甚至可以说是一种「影响力」。好的领导者通常具备影响和鼓励他人的能力,并导引他们朝着共同的目标和愿景前
    优思学院 2025-01-08 14:54 61浏览
  • 大模型的赋能是指利用大型机器学习模型(如深度学习模型)来增强或改进各种应用和服务。这种技术在许多领域都显示出了巨大的潜力,包括但不限于以下几个方面: 1. 企业服务:大模型可以用于构建智能客服系统、知识库问答系统等,提升企业的服务质量和运营效率。 2. 教育服务:在教育领域,大模型被应用于个性化学习、智能辅导、作业批改等,帮助教师减轻工作负担,提高教学质量。 3. 工业智能化:大模型有助于解决工业领域的复杂性和不确定性问题,尽管在认知能力方面尚未完全具备专家级的复杂决策能力。 4. 消费
    丙丁先生 2025-01-07 09:25 117浏览
  • 故障现象一辆2017款东风风神AX7车,搭载DFMA14T发动机,累计行驶里程约为13.7万km。该车冷起动后怠速运转正常,热机后怠速运转不稳,组合仪表上的发动机转速表指针上下轻微抖动。 故障诊断 用故障检测仪检测,发动机控制单元中无故障代码存储;读取发动机数据流,发现进气歧管绝对压力波动明显,有时能达到69 kPa,明显偏高,推断可能的原因有:进气系统漏气;进气歧管绝对压力传感器信号失真;发动机机械故障。首先从节气门处打烟雾,没有发现进气管周围有漏气的地方;接着拔下进气管上的两个真空
    虹科Pico汽车示波器 2025-01-08 16:51 70浏览
  • 每日可见的315MHz和433MHz遥控模块,你能分清楚吗?众所周知,一套遥控设备主要由发射部分和接收部分组成,发射器可以将控制者的控制按键经过编码,调制到射频信号上面,然后经天线发射出无线信号。而接收器是将天线接收到的无线信号进行解码,从而得到与控制按键相对应的信号,然后再去控制相应的设备工作。当前,常见的遥控设备主要分为红外遥控与无线电遥控两大类,其主要区别为所采用的载波频率及其应用场景不一致。红外遥控设备所采用的射频信号频率一般为38kHz,通常应用在电视、投影仪等设备中;而无线电遥控设备
    华普微HOPERF 2025-01-06 15:29 164浏览
我要评论
0
点击右上角,分享到朋友圈 我知道啦
请使用浏览器分享功能 我知道啦