# Config Structure

> **Info:** If you are not familiar with Hydra, please read our short introduction or the Hydra docs.

Our config is located in the `conf/` folder and consists of the following groups:
## Backbone

Path: `conf/backbone`

Default: `rugpt3large`

Description: Defines the name of the pretrained model and tokenizer.

Options:

- `rugpt3small` - loads `sberbank-ai/rugpt3small_based_on_gpt2`.
- `rugpt3medium` - loads `sberbank-ai/rugpt3medium_based_on_gpt2`.
- `rugpt3large` - loads `sberbank-ai/rugpt3large_based_on_gpt2`.

Option format:

```yaml
pretrained_model_name_or_path: <string>
```
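For instance, a backbone config equivalent to the `rugpt3large` option would contain just the model name (the filename below is assumed from the group path and option name):

```yaml
# Hypothetical conf/backbone/rugpt3large.yaml
pretrained_model_name_or_path: sberbank-ai/rugpt3large_based_on_gpt2
```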
## Model

Path: `conf/model`

Default: `default`

Description: Creates a model.

Options:

- `default` - loads an `AutoLMHeadModel` based on the backbone option.
- `gpt` - the same as `default`, but loads a `GPT2LMHeadModel`.

Option format:

An instantiatable config returning an instance of a pretrained model:

```yaml
_target_: <module>.<callable>
arg1: value1
arg2: value2
```
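The `_target_` mechanism can be illustrated with a simplified stand-in for `hydra.utils.instantiate`: import the dotted path and call it with the remaining keys as keyword arguments. This is a minimal sketch, not Hydra's actual implementation (real Hydra also handles nested configs, `_partial_`, positional arguments, and attribute paths inside a module):

```python
import importlib


def instantiate(config: dict):
    """Simplified stand-in for hydra.utils.instantiate: imports the
    dotted _target_ path and calls it with the remaining keys as
    keyword arguments."""
    params = dict(config)  # don't mutate the caller's config
    module_path, _, attr = params.pop("_target_").rpartition(".")
    target = getattr(importlib.import_module(module_path), attr)
    return target(**params)


# Any importable callable works, e.g. the stdlib Fraction class:
frac = instantiate({"_target_": "fractions.Fraction",
                    "numerator": 3, "denominator": 6})
print(frac)  # Fraction(3, 6) normalizes to 1/2
```

With this in mind, every "instantiatable config" below is simply a callable plus its keyword arguments.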
## Tokenizer

Path: `conf/tokenizer`

Default: `autotokenizer`

Description: Creates a tokenizer.

Options:

- `autotokenizer` - loads a tokenizer based on the backbone option.
- `rugpt3` - the same as `autotokenizer`, but also adds missing special tokens.

Option format:

An instantiatable config returning an instance of a pretrained tokenizer:

```yaml
_target_: <module>.<callable>
arg1: value1
arg2: value2
```
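As an illustration, an `autotokenizer`-style config could delegate to `transformers.AutoTokenizer.from_pretrained` (the exact `_target_` used by the library is an assumption here; `AutoTokenizer.from_pretrained` itself is a real transformers API):

```yaml
# Illustrative tokenizer config (the _target_ path is assumed)
_target_: transformers.AutoTokenizer.from_pretrained
pretrained_model_name_or_path: sberbank-ai/rugpt3large_based_on_gpt2
```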
## Dataset

Path: `conf/dataset`

Default: `default`

Description: Loads a dataset dict containing at least train and validation datasets.

Options:

- `default` - loads a dataset dict using the `datasets.load_dataset` function.
- `from_jsonl` - inherits from `default`; allows loading the dataset dict from JSON Lines files. Required fields: `data_files.train` and `data_files.validation`. Usage example: `dataset=from_jsonl data_files.train=/path/to/train.jsonl data_files.validation=/path/to/validation.jsonl`.

Option format:

An instantiatable config returning an instance of a dataset dict:

```yaml
_target_: <module>.<callable>
arg1: value1
arg2: value2
```
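For example, a `from_jsonl`-style config could point `datasets.load_dataset` at local files (this exact shape is an assumption about the library's config; `load_dataset("json", data_files=...)` is the standard datasets API):

```yaml
# Illustrative dataset config (assumed shape)
_target_: datasets.load_dataset
path: json
data_files:
  train: /path/to/train.jsonl
  validation: /path/to/validation.jsonl
```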
## Preprocessing

Path: `conf/preprocessing`

Default: `text2text`

Description: Returns an instance of a preprocessor.

Options:

- `text2text` - creates an instance of `Text2TextPreprocessor`. Required fields match those of the target class.

Option format:

An instantiatable config returning an instance of a preprocessor:

```yaml
_target_: <module>.<callable>
arg1: value1
arg2: value2
```
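For instance, a `text2text` config might look like the following; the module path is left as a placeholder, and the field names (`target_field`, `truncation_field`, `max_tokens`) are taken from the Task example in this document:

```yaml
# Illustrative preprocessing config (module path is a placeholder)
_target_: <module>.Text2TextPreprocessor
target_field: polite
truncation_field: toxic
max_tokens: 1792
```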
## Prompt Format

Path: `conf/prompt_format`

Default: `default`

Description: Defines the prompt format.

Options:

- `default` - creates an instance of `PromptFormat`.

Option format:

An instantiatable config returning an instance of a prompt format:

```yaml
_target_: <module>.<callable>
arg1: value1
arg2: value2
```
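For instance, reusing the template syntax from the Task example in this document, a prompt format config might look like the following (the module path is a placeholder; `<P*N>` presumably denotes N trainable prompt tokens, and `{toxic}` is a data field):

```yaml
# Illustrative prompt format config (module path is a placeholder)
_target_: <module>.PromptFormat
template: "<P*60>{toxic}<P*20>"
```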
## Prompt Provider

Path: `conf/prompt_provider`

Default: `tensor`

Description: Defines the prompt provider.

Options:

- `tensor` - creates an instance of `TensorPromptProvider`.
- `lstm` - creates an instance of `LSTMPromptProvider`.

Option format:

An instantiatable config returning an instance of a prompt provider:

```yaml
_target_: <module>.<callable>
arg1: value1
arg2: value2
```
## Optimizer

Path: `conf/optimizer`

Default: `adamw`

Description: Defines the optimizer.

Options:

- `adamw` - creates an instance of the `AdamW` optimizer.

Option format:

An instantiatable config returning an instance of a torch optimizer:

```yaml
_target_: <module>.<callable>
arg1: value1
arg2: value2
```
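For instance, an `adamw` config could target `torch.optim.AdamW` (an assumption about the exact `_target_`; the hyperparameter values below are illustrative, and the model parameters are presumably injected at instantiation time):

```yaml
# Illustrative optimizer config (values are examples only)
_target_: torch.optim.AdamW
lr: 1.0e-3
weight_decay: 0.01
```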
## Scheduler

Path: `conf/scheduler`

Default: `adamw`

Description: Defines the learning rate schedule.

Options:

- `linear_schedule_with_warmup` - creates a linear schedule.
- `constant_schedule_with_warmup` - creates a constant schedule.

Option format:

An instantiatable config returning an instance of a torch lr scheduler:

```yaml
_target_: <module>.<callable>
arg1: value1
arg2: value2
```
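For instance, a `linear_schedule_with_warmup` config could target transformers' `get_linear_schedule_with_warmup` (an assumption about the exact `_target_`; the function itself is a real transformers API and also expects `optimizer` and `num_training_steps`, presumably supplied at instantiation time):

```yaml
# Illustrative scheduler config (assumed _target_)
_target_: transformers.get_linear_schedule_with_warmup
num_warmup_steps: 100
```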
## Training arguments

Path: `conf/training`

Default: `default`

Description: Defines the training arguments.

Options:

- `default` - creates an instance of `TrainingArguments`.

Option format:

No other options are assumed.
## Callbacks

Path: `conf/callbacks`

Default:

- `freeze_transformer_unfreeze_prompt`
- `reduce_checkpoint`
- `save_pretrained_prompt`
- `wb_log_hydra_config`

Description: Selects the trainer callbacks.

Options:

- `freeze_transformer_unfreeze_prompt` - creates an instance of `FreezeTransformerUnfreezePrompt`. Freezes the pretrained transformer and unfreezes the prompt provider before training.
- `reduce_checkpoint` - creates an instance of `ReduceCheckpoint`. After each save, reduces the size of the saved checkpoint by removing all weights except those of the prompt provider.
- `save_pretrained_prompt` - creates an instance of `SavePretrainedPrompt`. Saves the trained prompt using `Prompt.save_pretrained` on each checkpoint.
- `wb_log_hydra_config` - creates an instance of `WBLogHydraConfig`. Logs the composed Hydra config to Weights & Biases before training.

Option format:

An instantiatable config returning an instance of `TrainerCallback`:

```yaml
_target_: <module>.<callable>
arg1: value1
arg2: value2
```
## Task

Path: `conf/task`

Default: `default`

Description: Overrides the parameters of other groups.

Options:

- `text2text` - selects `model=gpt`, `dataset=default` and `preprocessing=text2text`.
- Other configs that inherit from `text2text` should define the required group parameters in their bodies.

Option format:

```yaml
task_name: detoxification
defaults:
  - text2text
  - /dataset: from_jsonl
dataset:
  data_files:
    train: /path/to/train.jsonl
    validation: /path/to/validation.jsonl
prompt_format:
  template: "<P*60>{toxic}<P*20>"
preprocessing:
  target_field: "polite"
  truncation_field: "toxic"
  max_tokens: 1792
```