If you intend to participate in GSI:detect and would like to stay informed about the task, please fill in the expression of interest form HERE.
Your expression of interest is not binding but it will help us to organize the task.
Participants will be allowed to submit a maximum total of 15 runs per model/system as follows:
Runtype: Zero-shot (mandatory, up to 5 runs);
Runtype: Few-shot (up to 5 runs);
Runtype: Fine-tuned (up to 5 runs).
Use of data for in-context learning
Participants may also use the guidelines_for_annotation.pdf file included in the zipped dataset, which can also be consulted here.
Use of data for fine-tuning
The development set is released under a CC-BY-NC-SA 4.0 license.
There are no restrictions on the use of additional resources. Important: all data used for fine-tuning must be specified in your system report.
Required output
The output must be provided in .jsonl format, where each line represents a single JSON object containing the following fields:
{
"id": "unique_identifier",
"text": "Italian_short_text",
"gs_value": "number_between_0_and_1_with_two_decimals-predicted",
"gs_category": "predicted_gender_stereotype_category_[role, personality, competence, physical, sexual, relational]"
}
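As an illustration of the format above, here is a minimal Python sketch that builds records and serializes them as JSON Lines. The helper names `make_record` and `to_jsonl` are hypothetical, and the sketch assumes that gs_value is written as a JSON number rounded to two decimals and that every record carries one of the six listed categories. The template shows all values as quoted placeholders, so please check the exact types expected before submitting.

```python
import json

# Category labels copied from the gs_category field description.
GS_CATEGORIES = {
    "role", "personality", "competence", "physical", "sexual", "relational",
}


def make_record(item_id, text, gs_value, gs_category):
    """Build one output record, checking the value range and the category label."""
    if not 0.0 <= gs_value <= 1.0:
        raise ValueError(f"gs_value out of range: {gs_value}")
    if gs_category not in GS_CATEGORIES:
        raise ValueError(f"unknown gs_category: {gs_category}")
    return {
        "id": item_id,
        "text": text,
        # Assumption: a JSON number rounded to two decimals is acceptable.
        "gs_value": round(gs_value, 2),
        "gs_category": gs_category,
    }


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    # ensure_ascii=False keeps accented Italian characters readable.
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)
```

The resulting string can be written to a file opened with encoding="utf-8" so that the Italian text is preserved as-is.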
After running your system(s) on the raw test data, submit your results according to the recommendations below and those provided on the EVALITA website, in line with the Important Dates.
File Naming:
Choose a team name and name your run files according to the format: gsidetect_TeamName_SystemName_Runtype_RunNumber
For few-shot runs, also specify the number of examples used (e.g., gsidetect_TeamName_SystemName_few-shot10_RunNumber )
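The naming scheme can be sketched in Python as follows. The function `run_filename` is a hypothetical helper. Only the spelling "few-shot10" comes from the example above; the lowercase spellings "zero-shot" and "fine-tuned" and the plain integer run number are assumptions, and the file extension is left out because the format string does not include one.

```python
def run_filename(team, system, runtype, run_number, n_examples=None):
    """Return a run file name following gsidetect_TeamName_SystemName_Runtype_RunNumber."""
    # Assumed spellings; only "few-shot" is attested in the naming example.
    if runtype not in {"zero-shot", "few-shot", "fine-tuned"}:
        raise ValueError(f"unknown runtype: {runtype}")
    if runtype == "few-shot":
        if n_examples is None:
            raise ValueError("few-shot runs must state the number of examples")
        # The number of examples is appended directly, as in "few-shot10".
        runtype = f"few-shot{n_examples}"
    return f"gsidetect_{team}_{system}_{runtype}_{run_number}"
```

For instance, run_filename("TeamName", "SystemName", "few-shot", 1, n_examples=10) gives gsidetect_TeamName_SystemName_few-shot10_1.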
Submission Package:
Compress all relevant files into a single .zip archive.
Send it via email to: gsievalita@gmail.com
Subject line: “gsidetect_outputs – TeamName”.
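The packaging step can be done with any archiver. As one possible sketch, the Python standard library's zipfile module can bundle the run files; the helper name `build_submission_zip` and the archive name are hypothetical, since no archive name is prescribed here.

```python
import zipfile
from pathlib import Path


def build_submission_zip(run_files, archive_path):
    """Compress the given run files into a single .zip archive."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in run_files:
            # Store each file under its base name so the archive has no nested folders.
            zf.write(f, arcname=Path(f).name)
    return archive_path
```

Storing files under their base names is a design choice made for this sketch, so that the run file names remain visible at the top level of the archive.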
Technical report
After the evaluation, submit a report using the template and guidelines that will be published soon on the EVALITA website.
Your report must include:
a. Detailed description of the methodology;
b. All data resources used, including any additional data used;
c. Experimental details: prompts, preprocessing, hyper-parameter tuning, etc.;
d. Analysis of results.