
Few-shot In-Context Preference Learning Using Large Language Models: Full Prompts and ICPL Details

Tags: google mobile
DATE POSTED: December 3, 2024
Table of Links
  1. Abstract and Introduction
  2. Related Work
  3. Problem Definition
  4. Method
  5. Experiments
  6. Conclusion and References

A. Appendix

A.1 Full Prompts and A.2 ICPL Details

A.3 Baseline Details

A.4 Environment Details

A.5 Proxy Human Preference

A.6 Human-in-the-Loop Preference

A APPENDIX

For more information and videos, see https://sites.google.com/view/few-shot-icpl/home.

A.1 FULL PROMPTS

Initial System Prompts of Synthesizing Reward Functions

Feedback Prompts

Prompts of Tips for Writing Reward Functions

Prompts of Describing Differences

A.2 ICPL DETAILS

The full pseudocode of ICPL is listed in Algo. 2.
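Since the algorithm listing itself does not appear on this page, the following is a rough Python sketch of a preference-driven reward-synthesis loop of the kind the prompt titles above suggest. It is not a reproduction of Algo. 2. The function `icpl_loop`, its parameters, the three injected callables (`propose`, `train_and_render`, `pick_best_and_worst`), and the feedback prompt wording are all hypothetical names chosen for illustration.

```python
from typing import Callable, List, Tuple


def icpl_loop(
    propose: Callable[[str, int], List[str]],
    train_and_render: Callable[[str], str],
    pick_best_and_worst: Callable[[List[str]], Tuple[int, int]],
    task_prompt: str,
    num_iterations: int = 3,
    samples_per_iteration: int = 4,
) -> str:
    """Return the reward-function source preferred in the final iteration.

    All three callables are assumptions standing in for components that
    this page does not specify: an LLM call, an RL training run that
    produces a video, and a human (or proxy) preference query.
    """
    prompt = task_prompt
    best_reward = ""
    for _ in range(num_iterations):
        # 1. Ask the LLM for several candidate reward functions (as source code).
        candidates = propose(prompt, samples_per_iteration)
        # 2. Train one policy per candidate and render a video of its behaviour.
        videos = [train_and_render(code) for code in candidates]
        # 3. The evaluator picks the most and least preferred videos by index.
        best_idx, worst_idx = pick_best_and_worst(videos)
        best_reward = candidates[best_idx]
        # 4. Place the preference back into the LLM context for the next round.
        #    (Hypothetical wording; the paper's feedback prompts differ.)
        prompt = (
            f"{task_prompt}\n\n"
            f"Preferred reward function:\n{best_reward}\n\n"
            f"Least preferred reward function:\n{candidates[worst_idx]}\n"
        )
    return best_reward
```

Passing the LLM, the trainer, and the evaluator in as callables keeps the sketch runnable with simple stubs and avoids assuming any particular model API or simulator.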


:::info Authors:

(1) Chao Yu, Tsinghua University;

(2) Hong Lu, Tsinghua University;

(3) Jiaxuan Gao, Tsinghua University;

(4) Qixin Tan, Tsinghua University;

(5) Xinting Yang, Tsinghua University;

(6) Yu Wang, with equal advising from Tsinghua University;

(7) Yi Wu, with equal advising from Tsinghua University and the Shanghai Qi Zhi Institute;

(8) Eugene Vinitsky, with equal advising from New York University.

:::

:::info This paper is available on arXiv under a CC 4.0 license.

:::
