drlml (Iron Worm, Newcomer)
[Discussion]
University of Sheffield, UK (QS Top 100) is recruiting a Computer Science PhD student; applications close at the end of October!
We are recruiting a PhD student to develop new algorithms for reinforcement learning from human feedback (RLHF), to effectively solve complex reinforcement learning tasks without a predefined reward function. The primary goal of this project will be the development of a novel RLHF framework that can learn more complex behaviours while requiring significantly less interactive human feedback than current RLHF methods.

The direction of this project is highly flexible, and the student will have the opportunity to explore related directions that match their research interests. We intend for this project to explore applications of the new RLHF framework, such as fine-tuning and aligning large language models (LLMs), and the use of human feedback in robotics. The project may also explore the use of LLMs as part of the RLHF framework itself, to generate and/or interpret natural language feedback. The specific applications and research directions will depend on the student's own interests.

The preferred starting date for this position would be in February 2026, but this is very flexible.

Supervisors: Dr. Bei Peng, Dr. Robert Loftin

Application deadline: October 31, 2025

Requirements:
1. A Bachelor's or Master's degree in Computer Science, Mathematics, or related field.
2. Solid programming skills and mathematical background in machine learning/reinforcement learning.
3. Proficiency in programming languages such as Python and familiarity with common deep learning and machine learning frameworks.
4. Good English communication skills, with an IELTS score of 6.5 or above (with no less than 6.0 in each component).

Scholarship information: For UK home students, this is a fully funded 3.5-year PhD studentship. For international students, you will need to pay the difference between the UK and overseas tuition fees by securing additional funding or self-funding (i.e., the PhD studentship will cover tuition fees but not living expenses).

More information and instructions for how to apply can be found here: https://www.findaphd.com/phds/project/improving-deep-reinforcement-learning-through-interactive-human-feedback/?p186459
(When applying, make sure you name Dr. Bei Peng and Dr. Robert Loftin as your proposed supervisors.)

If you have any questions regarding the position, feel free to contact Dr. Bei Peng (bei.peng@sheffield.ac.uk).