I will test models by having chats in two ways. First, the model will pretend to be me and I will be chatting with myself from the perspective of my different friends. Then, I’ll chat as myself while the model acts as my friends. My conversation starter will always be the same 2 messages: “hey” and “what’s up?” (in Russian, “прив” and “как дела?”). Generated phrases and persons as the model acts who from will be highlighted. All conversations initially will be held in Russian and may be accessed by clicking on the ‘original’ details button. For testing I will be using oobabooga/text-generation-webui. (View Highlight)
LoRA offers a low-effort approach in terms of both the training pipeline and hardware requirements. It trains around 1% of the total weights. I chose a 1024 sequence length and a batch size of 8. The training, which consumed 20GB of VRAM on an RTX 3090, took three epochs and lasted for 5.5 hours. For this, I used vast.ai, where the GPU cost was 0.362perhour,totaling2 for the entire training, excluding time spent on experiments and bug fixes (View Highlight)