Abstract: Knowledge graph (KG) reasoning and completion under few-shot learning conditions face significant difficulties because of the limited number of labelled triplets for rare or new relations.
Our code is based on open-r1, with our customized Trainer for mixed SFT+GRPO training. Some other updates focus on the white-box RL (reward function design) and post-completion training (replacement ...
My father worked as a tailor and clothing salesman. My mother left Cuba seeking opportunity. Neither attended college. I was the first in my family to go, and that experience taught me something ...