模型蒸馏
模型蒸馏是指训练一个(通常较小的)学生模型来模仿一个(通常较大的)教师模型或教师模型集合的行为。这通常用于使模型更快、更便宜、更轻量。
Cross Encoder 知识蒸馏
目标是在相同的输入对(通常是查询-答案对)上,最小化学生模型逻辑(即原始模型输出)与教师模型逻辑之间的差异。
这里有两个训练脚本,它们使用了 Hostätter 等人预先计算的逻辑,他们在 MS MARCO 数据集上训练了一个由 3 个(大型)模型组成的集成模型,并预测了各种(查询、段落)对(50% 正样本,50% 负样本)的分数。
-
在此示例中,我们使用知识蒸馏,通过一个小型快速模型学习教师集成模型的逻辑分数。这使得性能与大型模型相当,同时速度快了 18 倍。
它使用
MSELoss
来最小化预测的学生逻辑与(查询,答案)对的预计算教师逻辑之间的距离。 train_cross_encoder_kd_margin_mse.py
这与上一个脚本的设置相同,但现在使用前面提到的 Hostätter 等人中使用的
MarginMSELoss
。MarginMSELoss
不适用于(查询,答案)对和预计算的逻辑,而是适用于(查询,正确答案,不正确答案)三元组以及对应于teacher.predict([query, correct_answer]) - teacher.predict([query, incorrect_answer])
的预计算逻辑。简而言之,这个预计算的逻辑是(查询,正确答案)和(查询,不正确答案)之间的差值。
推理
tomaarsen/reranker-MiniLM-L12-H384-margin-mse模型是用第二个脚本训练的。如果你想在自己蒸馏模型之前试用该模型,请随意使用此脚本
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("tomaarsen/reranker-modernbert-base-msmarco-margin-mse")
# Get scores for pairs of texts
pairs = [
["where is joplin airport", "Scott Joplin is important both as a composer for bringing ragtime to the concert hall, setting the stage (literally) for the rise of jazz; and as an early advocate for civil rights and education among American blacks. Joplin is a hero, and a national treasure of the United States."],
["where is joplin airport", "Flights from Jos to Abuja will get you to this shimmering Nigerian capital within approximately 19 hours. Flights depart from Yakubu Gowon Airport/ Jos Airport (JOS) and arrive at Nnamdi Azikiwe International Airport (ABV). Arik Air is the main airline flying the Jos to Abuja route."],
["where is joplin airport", "Janis Joplin returned to the music scene, knowing it was her destiny, in 1966. A friend, Travis Rivers, recruited her to audition for the psychedelic band, Big Brother and the Holding Company, based in San Francisco. The band was quite big in San Francisco at the time, and Joplin landed the gig."],
["where is joplin airport", "Joplin Regional Airport. Joplin Regional Airport (IATA: JLN, ICAO: KJLN, FAA LID: JLN) is a city-owned airport four miles north of Joplin, in Jasper County, Missouri. It has airline service subsidized by the Essential Air Service program. Airline flights and general aviation are in separate terminals."],
["where is joplin airport", 'Trolley and rail lines made Joplin the hub of southwest Missouri. As the center of the "Tri-state district", it soon became the lead- and zinc-mining capital of the world. As a result of extensive surface and deep mining, Joplin is dotted with open-pit mines and mine shafts.'],
]
scores = model.predict(pairs)
print(scores)
# [0.00410349 0.03430534 0.5108879 0.999984 0.91639173]
# Or rank different texts based on similarity to a single text
ranks = model.rank(
"where is joplin airport",
[
"Scott Joplin is important both as a composer for bringing ragtime to the concert hall, setting the stage (literally) for the rise of jazz; and as an early advocate for civil rights and education among American blacks. Joplin is a hero, and a national treasure of the United States.",
"Flights from Jos to Abuja will get you to this shimmering Nigerian capital within approximately 19 hours. Flights depart from Yakubu Gowon Airport/ Jos Airport (JOS) and arrive at Nnamdi Azikiwe International Airport (ABV). Arik Air is the main airline flying the Jos to Abuja route.",
"Janis Joplin returned to the music scene, knowing it was her destiny, in 1966. A friend, Travis Rivers, recruited her to audition for the psychedelic band, Big Brother and the Holding Company, based in San Francisco. The band was quite big in San Francisco at the time, and Joplin landed the gig.",
"Joplin Regional Airport. Joplin Regional Airport (IATA: JLN, ICAO: KJLN, FAA LID: JLN) is a city-owned airport four miles north of Joplin, in Jasper County, Missouri. It has airline service subsidized by the Essential Air Service program. Airline flights and general aviation are in separate terminals.",
'Trolley and rail lines made Joplin the hub of southwest Missouri. As the center of the "Tri-state district", it soon became the lead- and zinc-mining capital of the world. As a result of extensive surface and deep mining, Joplin is dotted with open-pit mines and mine shafts.',
],
)
print(ranks)
# [
# {"corpus_id": 3, "score": 0.999984},
# {"corpus_id": 4, "score": 0.91639173},
# {"corpus_id": 2, "score": 0.5108879},
# {"corpus_id": 1, "score": 0.03430534},
# {"corpus_id": 0, "score": 0.004103488},
# ]