# Config Structure

> **Info:** If you are not familiar with Hydra, please read our short introduction or the Hydra docs.

Our config is located in the `conf/` folder and consists of the following groups:
## Backbone

**Path:** `conf/backbone`

**Default:** `rugpt3large`

**Description:** Defines the name of the pretrained model and tokenizer.

**Options:**

- `rugpt3small` - loads `sberbank-ai/rugpt3small_based_on_gpt2`.
- `rugpt3medium` - loads `sberbank-ai/rugpt3medium_based_on_gpt2`.
- `rugpt3large` - loads `sberbank-ai/rugpt3large_based_on_gpt2`.

**Option format:**

```yaml
pretrained_model_name_or_path: <string>
```
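For instance, the `rugpt3large` option corresponds to a config roughly like the following (a sketch of a hypothetical `conf/backbone/rugpt3large.yaml`; the filename is an assumption, while the model identifier is taken from the options list above):

```yaml
# Hypothetical conf/backbone/rugpt3large.yaml:
# resolves the backbone option to a Hugging Face Hub model identifier.
pretrained_model_name_or_path: sberbank-ai/rugpt3large_based_on_gpt2
```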
## Model

**Path:** `conf/model`

**Default:** `default`

**Description:** Creates a model.

**Options:**

- `default` - loads an `AutoLMHeadModel` based on the backbone option.
- `gpt` - the same as `default`, but loads a `GPT2LMHeadModel`.

**Option format:**

An instantiable config returning an instance of a pretrained model:

```yaml
_target_: <module>.<callable>
arg1: value1
arg2: value2
```
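As an illustration, the `gpt` option could be expressed roughly like this (a sketch of a hypothetical `conf/model/gpt.yaml`; the interpolation path `${backbone.pretrained_model_name_or_path}` is an assumption about how the backbone group is wired in):

```yaml
# Hypothetical conf/model/gpt.yaml:
# instantiates a GPT2LMHeadModel from the checkpoint named by the backbone group.
_target_: transformers.GPT2LMHeadModel.from_pretrained
pretrained_model_name_or_path: ${backbone.pretrained_model_name_or_path}
```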
## Tokenizer

**Path:** `conf/tokenizer`

**Default:** `autotokenizer`

**Description:** Creates a tokenizer.

**Options:**

- `autotokenizer` - loads a tokenizer based on the backbone option.
- `rugpt3` - the same as `autotokenizer`, but also adds missing special tokens.

**Option format:**

An instantiable config returning an instance of a pretrained tokenizer:

```yaml
_target_: <module>.<callable>
arg1: value1
arg2: value2
```
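A sketch of what the `autotokenizer` option could look like (hypothetical `conf/tokenizer/autotokenizer.yaml`; the interpolation path is an assumption about how the backbone group is wired in):

```yaml
# Hypothetical conf/tokenizer/autotokenizer.yaml:
# loads the tokenizer matching the backbone checkpoint.
_target_: transformers.AutoTokenizer.from_pretrained
pretrained_model_name_or_path: ${backbone.pretrained_model_name_or_path}
```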
## Dataset

**Path:** `conf/dataset`

**Default:** `default`

**Description:** Loads a dataset dict containing at least `train` and `validation` datasets.

**Options:**

- `default` - loads a dataset dict with the `datasets.load_dataset` function.
- `from_jsonl` - inherits from `default`; allows loading the dataset dict from JSON files. Required fields: `data_files.train` and `data_files.validation`. Usage example: `dataset=from_jsonl data_files.train=/path/to/train.jsonl data_files.validation=/path/to/validation.jsonl`.

**Option format:**

An instantiable config returning an instance of a dataset dict:

```yaml
_target_: <module>.<callable>
arg1: value1
arg2: value2
```
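A `from_jsonl`-style config could look roughly like this (a sketch of a hypothetical `conf/dataset/from_jsonl.yaml`; `path: json` refers to the generic JSON loader of `datasets.load_dataset`, and `???` is Hydra's marker for a mandatory value):

```yaml
# Hypothetical conf/dataset/from_jsonl.yaml:
# loads train/validation splits from JSON Lines files.
_target_: datasets.load_dataset
path: json
data_files:
  train: ???        # set via data_files.train=/path/to/train.jsonl
  validation: ???   # set via data_files.validation=/path/to/validation.jsonl
```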
## Preprocessing

**Path:** `conf/preprocessing`

**Default:** `text2text`

**Description:** Returns an instance of a preprocessor.

**Options:**

- `text2text` - creates an instance of `Text2TextPreprocessor`. Required fields match the parameters of the target class.

**Option format:**

An instantiable config returning an instance of a preprocessor:

```yaml
_target_: <module>.<callable>
arg1: value1
arg2: value2
```
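Using the field names from the task example at the end of this page, a `text2text` override could be sketched as follows (the module path `ruprompts.preprocessing` and the concrete values are assumptions):

```yaml
# Hypothetical conf/preprocessing/text2text.yaml:
# fields are forwarded to Text2TextPreprocessor's constructor.
_target_: ruprompts.preprocessing.Text2TextPreprocessor
target_field: polite      # field containing the target text
truncation_field: toxic   # field truncated when the sample is too long
max_tokens: 1792
```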
## Prompt Format

**Path:** `conf/prompt_format`

**Default:** `default`

**Description:** Defines the prompt format.

**Options:**

- `default` - creates an instance of `PromptFormat`.

**Option format:**

An instantiable config returning an instance of a prompt format:

```yaml
_target_: <module>.<callable>
arg1: value1
arg2: value2
```
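For illustration, a prompt format config might look like this (the module path is an assumption; the template value is borrowed from the task example at the end of this page):

```yaml
# Hypothetical conf/prompt_format/default.yaml;
# the template mixes trainable prompt tokens with a data field placeholder.
_target_: ruprompts.PromptFormat
template: "<P*60>{toxic}<P*20>"
```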
## Prompt Provider

**Path:** `conf/prompt_provider`

**Default:** `tensor`

**Description:** Defines the prompt provider.

**Options:**

- `tensor` - creates an instance of `TensorPromptProvider`.
- `lstm` - creates an instance of `LSTMPromptProvider`.

**Option format:**

An instantiable config returning an instance of a prompt provider:

```yaml
_target_: <module>.<callable>
arg1: value1
arg2: value2
```
## Optimizer

**Path:** `conf/optimizer`

**Default:** `adamw`

**Description:** Defines the optimizer.

**Options:**

- `adamw` - creates an instance of the `AdamW` optimizer.

**Option format:**

An instantiable config returning an instance of a torch optimizer:

```yaml
_target_: <module>.<callable>
arg1: value1
arg2: value2
```
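An `adamw` config could be sketched as follows (the hyperparameter values are placeholders; the model parameters to optimize are presumably supplied by the trainer at instantiation time):

```yaml
# Hypothetical conf/optimizer/adamw.yaml:
# hyperparameters only; the trainer supplies the parameters to optimize.
_target_: torch.optim.AdamW
lr: 1.0e-3           # placeholder value
weight_decay: 0.01   # placeholder value
```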
## Scheduler

**Path:** `conf/scheduler`

**Default:** `adamw`

**Description:** Defines the learning rate schedule.

**Options:**

- `linear_schedule_with_warmup` - creates a linear schedule.
- `constant_schedule_with_warmup` - creates a constant schedule.

**Option format:**

An instantiable config returning an instance of a torch lr scheduler:

```yaml
_target_: <module>.<callable>
arg1: value1
arg2: value2
```
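A `linear_schedule_with_warmup` config could be sketched as follows (assuming it wraps `transformers.get_linear_schedule_with_warmup`; the optimizer and the total number of training steps are presumably injected by the trainer):

```yaml
# Hypothetical conf/scheduler/linear_schedule_with_warmup.yaml:
# warmup length only; optimizer and total step count come from the trainer.
_target_: transformers.get_linear_schedule_with_warmup
num_warmup_steps: 100   # placeholder value
```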
## Training Arguments

**Path:** `conf/training`

**Default:** `default`

**Description:** Defines the training arguments.

**Options:**

- `default` - creates an instance of `TrainingArguments`.

**Option format:**

No other options are assumed.
## Callbacks

**Path:** `conf/callbacks`

**Default:**

- `freeze_transformer_unfreeze_prompt`
- `reduce_checkpoint`
- `save_pretrained_prompt`
- `wb_log_hydra_config`

**Description:** Selects the trainer callbacks.

**Options:**

- `freeze_transformer_unfreeze_prompt` - creates an instance of `FreezeTransformerUnfreezePrompt`. Freezes the pretrained transformer and unfreezes the prompt provider before training.
- `reduce_checkpoint` - creates an instance of `ReduceCheckpoint`. After each save, reduces the size of the saved checkpoint by removing all weights except those of the prompt provider.
- `save_pretrained_prompt` - creates an instance of `SavePretrainedPrompt`. Saves the trained prompt with `Prompt.save_pretrained` at each checkpoint.
- `wb_log_hydra_config` - creates an instance of `WBLogHydraConfig`. Logs the composed Hydra config to Weights and Biases before training.

**Option format:**

An instantiable config returning an instance of `TrainerCallback`:

```yaml
_target_: <module>.<callable>
arg1: value1
arg2: value2
```
## Task

**Path:** `conf/task`

**Default:** `default`

**Description:** Overrides the parameters of other groups.

**Options:**

- `text2text` - selects `model=gpt`, `dataset=default` and `preprocessing=text2text`.
- Other configs that inherit from `text2text` should define the required group parameters in their bodies.

**Option format:**

```yaml
task_name: detoxification
defaults:
  - text2text
  - /dataset: from_jsonl

dataset:
  data_files:
    train: /path/to/train.jsonl
    validation: /path/to/validation.jsonl

prompt_format:
  template: "<P*60>{toxic}<P*20>"

preprocessing:
  target_field: "polite"
  truncation_field: "toxic"
  max_tokens: 1792