Exploring the effectiveness of methods for persona extraction
        
Zapiski Nauchnykh Seminarov POMI, Investigations on applied mathematics and informatics. Part IV, Tome 540 (2024), pp. 61-81
      
See the article record from the source Math-Net.Ru
            
The paper studies methods for extracting information about dialogue participants (personas) and evaluates their performance on Russian-language data. To train models for this task, the Multi-Session Chat dataset was translated into Russian using multiple translation models, which improved data quality. A metric based on the F-score is proposed to evaluate the extraction models; it uses a trained classifier to identify the dialogue participant to whom a persona belongs. Experiments were conducted with MBart, FRED-T5, Starling-7B (which is based on Mistral), and Encoder2Encoder models. The results show that all models exhibited insufficient recall on the persona extraction task. Adding the NCE loss improved precision at the expense of recall. Furthermore, increasing the model size improved persona extraction.
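As a rough illustration of the F-score idea mentioned in the abstract, the sketch below computes precision, recall, and F1 over extracted versus reference persona statements. It is not the paper's metric: the abstract does not specify how the trained classifier is used, so the `matches` predicate here is a hypothetical stand-in (plain string equality in the example), and the function name `persona_f_score` is likewise an assumption made for illustration.

```python
from typing import Callable, Sequence, Tuple


def persona_f_score(
    extracted: Sequence[str],
    reference: Sequence[str],
    matches: Callable[[str, str], bool],
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for extracted persona statements.

    `matches` is a hypothetical predicate deciding whether an extracted
    statement corresponds to a reference one. In the paper this role is
    played by a trained classifier; its details are not given here.
    """
    # Extracted statements that correspond to at least one reference statement.
    hit_extracted = sum(any(matches(e, r) for r in reference) for e in extracted)
    # Reference statements that are covered by at least one extracted statement.
    hit_reference = sum(any(matches(e, r) for e in extracted) for r in reference)

    precision = hit_extracted / len(extracted) if extracted else 0.0
    recall = hit_reference / len(reference) if reference else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example with exact string matching as the (assumed) predicate.
extracted = ["i have a dog", "i like tea", "i live in paris"]
reference = ["i have a dog", "i like tea", "i play chess", "i am a teacher"]
p, r, f1 = persona_f_score(extracted, reference, lambda a, b: a == b)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

In this toy case two of three extracted statements are correct and two of four reference statements are recovered, which mirrors the trade-off the abstract describes: a model can be fairly precise while still missing much of the persona information.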
        
@article{ZNSL_2024_540_a3,
     author = {K. Zaitsev},
     title = {Exploring the effectiveness of methods for persona extraction},
     journal = {Zapiski Nauchnykh Seminarov POMI},
     pages = {61--81},
     publisher = {mathdoc},
     volume = {540},
     year = {2024},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/ZNSL_2024_540_a3/}
}
                      
                    K. Zaitsev. Exploring the effectiveness of methods for persona extraction. Zapiski Nauchnykh Seminarov POMI, Investigations on applied mathematics and informatics. Part IV, Tome 540 (2024), pp. 61-81. http://geodesic.mathdoc.fr/item/ZNSL_2024_540_a3/